@xuxiong1
Contributor

@xuxiong1 xuxiong1 commented Dec 4, 2025

Description

This commit introduces a feature to read translog operations in forward order
(oldest to newest) instead of the default backward order (newest to oldest).

Changes:

  • Add index.translog.read_forward setting (default: false) in IndexSettings
  • Update MultiSnapshot to support bidirectional reading based on setting

Tests:

  • testRecoveryTrimsLocalTranslogWithReadForward (RecoveryTests)
  • testSeqNoCollisionWithReadForward (IndexLevelReplicationTests)
  • testSnapshotReadOperationForward (LocalTranslogTests)

Related Issues

Resolves #20094

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • New Features

    • Added a configurable option to enable forward reading of transaction logs for snapshots and recoveries, providing an alternative to the default reverse-reading behavior.
  • Tests

    • Added tests validating forward translog reading during snapshots, replication/recovery flows, and recovery trimming to ensure correct ordering and deduplication.
  • Documentation

    • Updated changelog entry announcing forward translog read support.

@coderabbitai

coderabbitai bot commented Dec 4, 2025

Walkthrough

Adds an index-level boolean setting to enable translog forward reading, threads the flag from IndexSettings through Translog into MultiSnapshot, updates MultiSnapshot to traverse snapshots forward when enabled, and adds tests for replication, recovery, and local translog snapshot behavior.
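
As a rough illustration of the wiring described above, the sketch below threads a boolean from a settings holder through a translog stand-in into a snapshot stand-in. All three classes (`IndexSettingsSketch`, `TranslogSketch`, `MultiSnapshotSketch`) and the map-based settings lookup are hypothetical simplifications, not the OpenSearch classes; only the key `index.translog.read_forward`, its default of false, and the method names `isTranslogReadForward()` and `newMultiSnapshot()` come from this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT the OpenSearch classes.
final class ReadForwardWiringSketch {

    // Setting key named in the PR description.
    static final String READ_FORWARD_KEY = "index.translog.read_forward";

    /** Reads the flag once at construction; defaults to false (backward reading). */
    static final class IndexSettingsSketch {
        private final boolean translogReadForward;

        IndexSettingsSketch(Map<String, String> settings) {
            this.translogReadForward = Boolean.parseBoolean(settings.getOrDefault(READ_FORWARD_KEY, "false"));
        }

        boolean isTranslogReadForward() {
            return translogReadForward;
        }
    }

    /** Holds the direction it was constructed with. */
    static final class MultiSnapshotSketch {
        final boolean readForward;

        MultiSnapshotSketch(boolean readForward) {
            this.readForward = readForward;
        }
    }

    /** Passes the flag from the settings holder into every snapshot it creates. */
    static final class TranslogSketch {
        private final IndexSettingsSketch indexSettings;

        TranslogSketch(IndexSettingsSketch indexSettings) {
            this.indexSettings = indexSettings;
        }

        MultiSnapshotSketch newMultiSnapshot() {
            return new MultiSnapshotSketch(indexSettings.isTranslogReadForward());
        }
    }

    /** Convenience entry point: builds the chain and reports the snapshot's direction. */
    static boolean snapshotReadsForward(Map<String, String> settings) {
        return new TranslogSketch(new IndexSettingsSketch(settings)).newMultiSnapshot().readForward;
    }
}
```

With an empty settings map, `snapshotReadsForward` reports backward reading; setting the key to `"true"` flips the direction for snapshots created afterwards from a newly built settings holder.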

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changelog & Index settings: CHANGELOG.md, server/src/main/java/org/opensearch/index/IndexSettings.java | Added changelog entry and new index-scoped boolean setting INDEX_TRANSLOG_READ_FORWARD_SETTING (default false); added backing field and public accessor isTranslogReadForward(). |
| Translog snapshot traversal: server/src/main/java/org/opensearch/index/translog/MultiSnapshot.java | Constructor updated to accept boolean readForward; new readForward field; next() split to iterate snapshots forward (ascending index) when enabled or preserve existing backward iteration otherwise; deduplication logic preserved. |
| Translog integration: server/src/main/java/org/opensearch/index/translog/Translog.java | newMultiSnapshot() now reads indexSettings().isTranslogReadForward() and passes readForward into MultiSnapshot construction. |
| Replication tests: server/src/test/java/org/opensearch/index/replication/IndexLevelReplicationTests.java | Added testSeqNoCollisionWithReadForward() which enables the forward-read setting and validates peer-recovery behavior with seqNo/term collision scenarios. |
| Recovery tests: server/src/test/java/org/opensearch/indices/recovery/RecoveryTests.java | Added testRecoveryTrimsLocalTranslogWithReadForward() mirroring existing recovery trimming test but with forward-read enabled. |
| Translog unit tests: server/src/test/java/org/opensearch/index/translog/LocalTranslogTests.java | Added testSnapshotReadOperationForward() to assert that a forward-enabled translog snapshot returns operations concatenated in forward generation order. |

Sequence Diagram

sequenceDiagram
    participant Config as Index Config
    participant Settings as IndexSettings
    participant Translog as Translog
    participant Multi as MultiSnapshot
    participant Files as Translog Files

    rect rgb(240,248,255)
        Config->>Settings: load index settings
        Settings->>Settings: read INDEX_TRANSLOG_READ_FORWARD_SETTING
    end

    rect rgb(255,250,240)
        Translog->>Settings: isTranslogReadForward()
        Translog->>Multi: new MultiSnapshot(snapshots, onClose, readForward)
        alt readForward == true
            Multi->>Multi: startIndex = 0 (forward)
        else
            Multi->>Multi: startIndex = translogs.length - 1 (backward)
        end
    end

    rect rgb(240,255,240)
        loop replay snapshots
            alt forward
                Multi->>Files: read snapshot at ascending index
                Files-->>Multi: operations (forward order)
            else backward
                Multi->>Files: read snapshot at descending index
                Files-->>Multi: operations (reverse order)
            end
            Multi->>Multi: deduplicate by seqNo and yield operations
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus:
    • MultiSnapshot: verify forward and backward branches are equivalent in deduplication, termination, and index advancement (off-by-one risks).
    • Translog.newMultiSnapshot: ensure readForward is propagated in all snapshot creation paths.
    • IndexSettings: confirm setting scope, default and accessor correctness.
    • New tests: check determinism and that they sufficiently exercise trimming/edge cases.

Poem

🐰 I hop through logs both new and old,
Forward now, each tale retold.
Generations march in tidy line,
SeqNos kept neat, no duplicates shine.
A little rabbit cheers this forward time!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.67% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add support for forward translog reading' is concise, clear, and accurately summarizes the main change: introducing the ability to read translog in forward order. |
| Description check | ✅ Passed | The PR description follows the template structure with a clear description of changes, linked issue reference (#20094), and completed checklist items confirming testing is included. |
| Linked Issues check | ✅ Passed | All coding requirements from #20094 are met: forward translog reading setting added, MultiSnapshot updated for bidirectional reading, and comprehensive tests (RecoveryTests, IndexLevelReplicationTests, LocalTranslogTests) verify the functionality. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing forward translog reading: settings configuration, MultiSnapshot logic, Translog integration, and corresponding tests are all within scope. |
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73f1414 and fe0811d.

📒 Files selected for processing (1)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
🔇 Additional comments (4)
server/src/main/java/org/opensearch/index/IndexSettings.java (4)

199-218: LGTM! Comprehensive setting definition with edge case documentation.

The setting definition is well-structured with:

  • Safe default (false) preserving existing backward-reading behavior
  • Appropriate scope (IndexScope only, non-dynamic)
  • Thorough Javadoc explaining the purpose, safety considerations, and edge case scenarios

The past Javadoc syntax issue on lines 209-210 has been properly addressed.


915-915: LGTM! Field declaration follows established patterns.

The field is correctly declared as private final boolean, matching the non-dynamic nature of the setting and following the same pattern used for other immutable index settings in this class.


1136-1136: LGTM! Proper initialization in constructor.

The field is initialized correctly using the standard Setting.get(settings) pattern, and appropriately placed with other translog-related initializations. No settings update consumer is registered, which is correct for this non-dynamic setting.


2103-2108: LGTM! Public getter ready for consumption.

The getter method follows established conventions with clear Javadoc and proper visibility. This method is correctly positioned to be consumed by Translog and MultiSnapshot components (as indicated in the PR summary) to control translog traversal order.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added the enhancement (Enhancement or improvement to existing feature or request) and Indexing:Replication (Issues and PRs related to core replication framework, e.g. segrep) labels Dec 4, 2025
@github-actions
Contributor

github-actions bot commented Dec 4, 2025

❌ Gradle check result for c968213: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: xuxiong1 <[email protected]>
Signed-off-by: xuxiong1 <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/replication/IndexLevelReplicationTests.java (1)

638-704: Consider extracting common test logic.

The test duplicates 66 lines from testSeqNoCollision() (lines 571-636), differing only in the setting on line 646. While explicit duplication in tests aids clarity, you might consider a parameterized test or helper method to reduce maintenance overhead if the test logic evolves.

Example approach using a helper method:

public void testSeqNoCollision() throws Exception {
    testSeqNoCollisionWithSettings(Settings.builder()
        .put(IndexSettings.INDEX_SOFT_DELETES_SETTING.getKey(), true)
        .put(IndexSettings.INDEX_TRANSLOG_RETENTION_AGE_SETTING.getKey(), "-1")
        .put(IndexSettings.INDEX_TRANSLOG_RETENTION_SIZE_SETTING.getKey(), "-1")
        .build());
}

public void testSeqNoCollisionWithReadForward() throws Exception {
    testSeqNoCollisionWithSettings(Settings.builder()
        .put(IndexSettings.INDEX_SOFT_DELETES_SETTING.getKey(), true)
        .put(IndexSettings.INDEX_TRANSLOG_RETENTION_AGE_SETTING.getKey(), "-1")
        .put(IndexSettings.INDEX_TRANSLOG_RETENTION_SIZE_SETTING.getKey(), "-1")
        .put(IndexSettings.INDEX_TRANSLOG_READ_FORWARD_SETTING.getKey(), true)
        .build());
}

private void testSeqNoCollisionWithSettings(Settings settings) throws Exception {
    // Common test logic here
}
server/src/test/java/org/opensearch/index/translog/LocalTranslogTests.java (1)

3876-3934: Forward-read snapshot test wiring looks correct

The test correctly:

  • Constructs a translog with INDEX_TRANSLOG_READ_FORWARD_SETTING enabled via getTranslogConfig(tempDir, settings).
  • Uses a separate LocalTranslog instance so it doesn’t interfere with the class-level translog.
  • Populates views in generation order and asserts that newSnapshot() yields the concatenated operations in forward (oldest→newest) order.

This aligns with the new forward-reading MultiSnapshot semantics and gives good coverage for multi-generation snapshots.

As a minor optional clean-up, you could factor the common setup logic between this test and testSnapshotReadOperationInReverse into a small helper to reduce duplication, but it’s not strictly necessary here.

server/src/test/java/org/opensearch/indices/recovery/RecoveryTests.java (1)

533-572: Read-forward recovery test mirrors baseline behavior appropriately

This test cleanly mirrors testRecoveryTrimsLocalTranslog while:

  • Enabling INDEX_TRANSLOG_READ_FORWARD_SETTING on the index.
  • Creating the replication group with an InternalEngineFactory, keeping the execution model aligned with the baseline test.
  • Reusing the same flow (in-flight docs, replica promotion/demotion, recovery, and consistency assertions), which is exactly what we need to validate trimming semantics under forward translog reading.

The coverage looks solid and should catch regressions specific to read-forward mode.

If you want to reduce maintenance overhead later, you could extract the common body of this test and testRecoveryTrimsLocalTranslog into a shared helper that takes the Settings (or a boolean readForward) as a parameter, but this duplication is acceptable as-is.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0449ce8 and 234b0c9.

📒 Files selected for processing (7)
  • CHANGELOG.md (1 hunks)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (4 hunks)
  • server/src/main/java/org/opensearch/index/translog/MultiSnapshot.java (2 hunks)
  • server/src/main/java/org/opensearch/index/translog/Translog.java (1 hunks)
  • server/src/test/java/org/opensearch/index/replication/IndexLevelReplicationTests.java (1 hunks)
  • server/src/test/java/org/opensearch/index/translog/LocalTranslogTests.java (1 hunks)
  • server/src/test/java/org/opensearch/indices/recovery/RecoveryTests.java (1 hunks)
🔇 Additional comments (4)
server/src/main/java/org/opensearch/index/translog/Translog.java (1)

764-765: LGTM!

The implementation correctly reads the new isTranslogReadForward() setting from IndexSettings and passes it to the MultiSnapshot constructor, enabling bidirectional translog reading based on index configuration.

server/src/main/java/org/opensearch/index/IndexSettings.java (1)

914-914: Translog read‑forward flag wiring looks correct; confirm non‑dynamic lifecycle is intentional

The new translogReadForward flag is:

  • Backed by INDEX_TRANSLOG_READ_FORWARD_SETTING with default false.
  • Read once at construction time from settings.
  • Stored in a final field and exposed via isTranslogReadForward().

This means the read direction is effectively fixed for the lifetime of the IndexSettings instance and cannot be changed via a dynamic settings update (the setting also lacks Property.Dynamic and there is no update consumer).

If the intent is “configure at index creation (or close/open) only”, this is perfectly fine and keeps behavior simple. If you expect operators to toggle read‑forward on an already‑open index, you’d need to:

  • Mark the setting as Property.Dynamic.
  • Store it in a volatile field instead of final.
  • Register an update consumer on scopedSettings similar to other translog settings.

Given the sensitivity of recovery semantics, the non‑dynamic approach is probably safer, but it’s worth double‑checking that this matches your operational expectations.

Also applies to: 1135-1135, 2102-2107
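
To make the trade-off above concrete, here is a minimal sketch contrasting the two lifecycles: a value captured once in a final field versus a volatile field refreshed by an update consumer. `FixedFlag`, `DynamicFlag`, and `UpdateRegistry` are hypothetical stand-ins invented for illustration; they are not the OpenSearch settings API, and the real registration on scopedSettings will look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins to contrast the two lifecycles; not the OpenSearch settings API.
final class SettingLifecycleSketch {

    /** Non-dynamic: the value is captured once in a final field and has no update path. */
    static final class FixedFlag {
        private final boolean readForward;

        FixedFlag(boolean initial) {
            this.readForward = initial;
        }

        boolean isReadForward() {
            return readForward;
        }
    }

    /** Dynamic: a volatile field plus an update consumer registered at construction. */
    static final class DynamicFlag {
        private volatile boolean readForward;

        DynamicFlag(boolean initial, UpdateRegistry registry) {
            this.readForward = initial;
            registry.addConsumer(value -> this.readForward = value);
        }

        boolean isReadForward() {
            return readForward;
        }
    }

    /** Minimal registry that pushes a new value to every registered consumer. */
    static final class UpdateRegistry {
        private final List<Consumer<Boolean>> consumers = new ArrayList<>();

        void addConsumer(Consumer<Boolean> consumer) {
            consumers.add(consumer);
        }

        void apply(boolean newValue) {
            for (Consumer<Boolean> consumer : consumers) {
                consumer.accept(newValue);
            }
        }
    }
}
```

After `registry.apply(true)`, a `DynamicFlag` created with `false` reports `true`, while a `FixedFlag` created with `false` is unaffected. The PR as written corresponds to the `FixedFlag` shape.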

server/src/main/java/org/opensearch/index/translog/MultiSnapshot.java (2)

57-70: Constructor wiring for readForward flag and index initialization looks sound

The added readForward flag and constructor wiring are consistent:

  • readForward is final, so direction is immutable per snapshot.
  • index = readForward ? 0 : translogs.length - 1; correctly handles both directions, including the empty‑array case: the start index is 0 when reading forward and -1 when reading backward, and in both cases the loop bounds prevent any out‑of‑bounds access.

No correctness issues here.
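
A small sketch of the start-index reasoning, assuming the forward loop runs while `index < length` and the backward loop runs while `index >= 0` (the loop conditions are an assumption here; only the initializer expression is quoted from the change). `StartIndexSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors only the index arithmetic, not MultiSnapshot itself.
final class StartIndexSketch {

    /** Same expression as the constructor initializer discussed above. */
    static int startIndex(boolean readForward, int length) {
        return readForward ? 0 : length - 1;
    }

    /** Returns the array positions in the order they would be visited. */
    static List<Integer> visitOrder(boolean readForward, int length) {
        List<Integer> order = new ArrayList<>();
        int index = startIndex(readForward, length);
        if (readForward) {
            // Assumed guard: stop once the index reaches the array length.
            for (; index < length; index++) {
                order.add(index);
            }
        } else {
            // Assumed guard: stop once the index drops below zero.
            for (; index >= 0; index--) {
                order.add(index);
            }
        }
        return order;
    }
}
```

For a length of 3 this visits 0, 1, 2 forward and 2, 1, 0 backward. For a length of 0 the start index is 0 or -1 respectively, and neither loop body runs.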


84-111: Forward vs backward traversal shares correct dedupe semantics; relies on trim precondition

The new next() implementation cleanly bifurcates behavior:

  • Forward path (readForward == true): iterates index from 0 to translogs.length - 1, consuming each TranslogSnapshot in order and using seenSeqNo.getAndSet + overriddenOperations in the same way as before.
  • Backward path keeps the original behavior (from translogs.length - 1 down to 0) with identical per‑operation logic.

A few points worth noting:

  • The reuse of the same inner loop and SeqNoSet logic in both branches preserves the existing semantics of “first occurrence wins with respect to the chosen traversal order” and keeps overriddenOperations accounting correct.
  • With backward reading, “first occurrence” means “latest generation wins”, which avoids stale operations from older primary terms.
  • With forward reading, “first occurrence” means “oldest generation wins”. This matches the risk described in the new index setting Javadoc: if trimming of stale operations (trimOperationOfPreviousPrimaryTerms(...)) hasn’t happened yet, forward traversal can surface outdated ops from older primary terms.

Given that:

  • As long as forward reading is only enabled in flows where stale‑term trimming is guaranteed to have run before constructing this MultiSnapshot, this implementation is consistent with the documented behavior.
  • If you want extra safety, you could consider adding assertions or tighter invariants at the call site (e.g., around when forward snapshots are created) to ensure we don’t accidentally use forward traversal in a pre‑trim state, but that’s optional and outside this class.

Overall, the bidirectional iteration logic here is correct and symmetric.
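
The "first occurrence wins" point can be illustrated with a simplified model. The sketch below is not the MultiSnapshot code: it uses a plain `HashSet` in place of the real seqNo tracking, models each generation as an in-memory list, and assumes operations within one generation are read in file order in both modes. `Op`, `Result`, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the traversal; a HashSet stands in for the real seqNo tracking.
final class DedupeDirectionSketch {

    /** A translog operation reduced to the fields that matter for deduplication. */
    static final class Op {
        final long seqNo;
        final long primaryTerm;
        final String source;

        Op(long seqNo, long primaryTerm, String source) {
            this.seqNo = seqNo;
            this.primaryTerm = primaryTerm;
            this.source = source;
        }
    }

    /** Operations yielded by a read plus the number of duplicates that were skipped. */
    static final class Result {
        final List<Op> ops;
        final int overridden;

        Result(List<Op> ops, int overridden) {
            this.ops = ops;
            this.overridden = overridden;
        }
    }

    /** Visits generations oldest-first or newest-first; the first op seen per seqNo wins. */
    static Result read(List<List<Op>> generations, boolean readForward) {
        Set<Long> seenSeqNo = new HashSet<>();
        List<Op> yielded = new ArrayList<>();
        int overridden = 0;
        int count = generations.size();
        for (int step = 0; step < count; step++) {
            int index = readForward ? step : count - 1 - step;
            // Assumption: within one generation, ops are read in file order in both modes.
            for (Op op : generations.get(index)) {
                if (seenSeqNo.add(op.seqNo)) {
                    yielded.add(op);
                } else {
                    overridden++;
                }
            }
        }
        return new Result(yielded, overridden);
    }

    /** Two generations where seqNo 1 was written under term 1 and again under term 2. */
    static List<List<Op>> collidingGenerations() {
        return Arrays.asList(
            Arrays.asList(new Op(0, 1, "a"), new Op(1, 1, "stale")),
            Arrays.asList(new Op(1, 2, "fresh"), new Op(2, 2, "c"))
        );
    }

    /** Returns the source of the yielded op with the given seqNo, or null if absent. */
    static String sourceFor(Result result, long seqNo) {
        for (Op op : result.ops) {
            if (op.seqNo == seqNo) {
                return op.source;
            }
        }
        return null;
    }
}
```

On the sample data, a backward read yields the term-2 operation ("fresh") for seqNo 1, while a forward read yields the term-1 operation ("stale"); both skip exactly one duplicate. This is the untrimmed scenario the comment above warns about: once stale operations from older terms have been trimmed, the collision disappears and both directions yield the same set of operations.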

Signed-off-by: xuxiong1 <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 234b0c9 and 73f1414.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/IndexSettings.java (3)

915-915: LGTM! Field declaration is correct.

The field is appropriately declared as final since the setting is not dynamic.


1136-1136: LGTM! Field initialization is correct.

The initialization follows the standard pattern for non-dynamic IndexScope settings.


2103-2108: LGTM! Getter method is correctly implemented.

The method follows standard naming conventions and provides appropriate access to the setting value.

Signed-off-by: xuxiong1 <[email protected]>
@github-actions
Contributor

github-actions bot commented Dec 5, 2025

❌ Gradle check result for fe0811d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Labels

  • enhancement: Enhancement or improvement to existing feature or request
  • Indexing:Replication: Issues and PRs related to core replication framework, e.g. segrep

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] read forward in translog

1 participant